Credit Card Users Churn Prediction

Description

Background & Context

Thera Bank recently saw a steep decline in the number of users of its credit card. Credit cards are a good source of income for banks because of the different kinds of fees they charge, such as annual fees, balance transfer fees, cash advance fees, late payment fees, foreign transaction fees, and others. Some fees are charged to every user irrespective of usage, while others are charged under specified circumstances.

Customers leaving the credit card service would cause the bank a loss, so the bank wants to analyze its customer data, identify the customers who are likely to leave, and understand their reasons for leaving, so that it can improve in those areas.

As a data scientist at Thera Bank, you need to come up with a classification model that will help the bank improve its services so that customers do not renounce their credit cards.

You need to identify the best possible model that will give the required performance.

Objective

Explore and visualize the dataset. Build a classification model to predict if the customer is going to churn or not. Optimize the model using appropriate techniques. Generate a set of insights and recommendations that will help the bank.

Data Dictionary:

Import Libraries

Exploratory Data Analysis and Insights

Load and view dataset

Bivariate Analysis

Summary EDA

Data pre-processing

Model building

Splitting the data

Logistic Regression

BaggingClassifier

DecisionTreeClassifier

AdaBoostClassifier

GradientBoostingClassifier

RandomForestClassifier

Model building - Oversampled data

Model building - Undersampled data
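
The two resampling sections above rebalance the training data before refitting the models. The notebook text does not say which resampling method was used, so the following is only a minimal sketch that assumes plain random oversampling and undersampling via `sklearn.utils.resample`. The array names and the 16/84 class split are hypothetical stand-ins, not the bank data. Synthetic oversamplers such as SMOTE from imbalanced-learn are a common alternative.

```python
import numpy as np
from sklearn.utils import resample

# Hypothetical stand-in data: 16 churners (1) and 84 non-churners (0).
rng = np.random.RandomState(1)
X = rng.normal(size=(100, 3))
y = np.array([1] * 16 + [0] * 84)

X_min, X_maj = X[y == 1], X[y == 0]

# Random oversampling: draw minority rows with replacement until
# the minority class is as large as the majority class.
X_min_up = resample(X_min, replace=True, n_samples=len(X_maj), random_state=1)
X_over = np.vstack([X_maj, X_min_up])
y_over = np.array([0] * len(X_maj) + [1] * len(X_min_up))

# Random undersampling: keep a random subset of majority rows,
# without replacement, the same size as the minority class.
X_maj_down = resample(X_maj, replace=False, n_samples=len(X_min), random_state=1)
X_under = np.vstack([X_maj_down, X_min])
y_under = np.array([0] * len(X_maj_down) + [1] * len(X_min))
```

Resampling of this kind would normally be applied to the training split only, so that validation and test scores still reflect the original class balance.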

Hyperparameter tuning using random search

AdaBoost, Gradient Boosting, and Random Forest trained on the oversampled data are chosen for tuning, since all three show high recall with precision not far behind. Random search is used to find the best option among them.
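
As an illustration of the random search step, here is a minimal sketch that tunes an AdaBoost classifier with `RandomizedSearchCV`, scored on recall. The synthetic dataset, the parameter grid, and the values of `n_iter` and `cv` are assumptions chosen to keep the example small. They are not the settings used in this notebook, and the oversampling step is left out for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the bank data (assumed ~16% churners).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.84, 0.16], random_state=1
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

# Hypothetical search space; the real grid is not given in the text.
param_distributions = {
    "n_estimators": [25, 50, 100],
    "learning_rate": [0.05, 0.1, 0.5, 1.0],
}

search = RandomizedSearchCV(
    estimator=AdaBoostClassifier(random_state=1),
    param_distributions=param_distributions,
    n_iter=5,          # number of random parameter combinations to try
    scoring="recall",  # optimize for catching churners
    cv=3,
    random_state=1,
)
search.fit(X_train, y_train)

best_params = search.best_params_
best_cv_recall = search.best_score_
```

The same pattern applies to `GradientBoostingClassifier` and `RandomForestClassifier` by swapping the estimator and its parameter grid, which makes the three tuned models directly comparable on the same scoring metric.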

Model Performances

After hyperparameter tuning, AdaBoost gives a good recall of 91.6% while maintaining a precision of 90%, and can be considered the best-performing model. Gradient Boosting and Random Forest are also good options, but on overall recall together with precision, the current choice looks best.
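
For reference, recall is the share of actual churners the model catches, and precision is the share of predicted churners who really churn. The sketch below computes both on a tiny set of hypothetical labels, which are made up for illustration and are not the model's predictions.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical labels: 1 = churned, 0 = stayed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# For binary labels, ravel() yields counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

recall = recall_score(y_true, y_pred)        # tp / (tp + fn) = 3 / 4
precision = precision_score(y_true, y_pred)  # tp / (tp + fp) = 3 / 4
```

In a churn setting, a missed churner (false negative) is usually the costlier error, which is why recall is emphasized here while precision is kept in check.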

Productionize the model

Actionable Insights & Recommendations

Business recommendations and insights

Thank you - Amogh